SYSTEM AND METHOD FOR ANALYZING
AUTOMATIC SPEECH RECOGNITION PERFORMANCE DATA 



TECHNICAL FIELD OF THE INVENTION 

The present invention relates in general to 
information handling systems and, in particular, to a 
system, a method, and a program product for analyzing 
automatic speech recognition performance data.

BACKGROUND OF THE INVENTION

Automatic speech recognition (ASR) technology has 
improved greatly in recent years, and various companies 
are beginning to use it to provide customer service, such 
as in interactive voice response (IVR) systems. Multiple
vendors offer different forms of ASR technology, and 
customers may desire to analyze and compare competing ASR 
products before selecting a particular ASR product for 
implementation. 

For example, a company may desire to evaluate and
compare selected ASR products or systems by conducting
usability studies in which individuals, such as customers
of the company, interact with the selected ASR systems by
telephone. Each ASR system may interpret the
participant's utterances and produce a log file detailing
each event that occurs during a call. The ASR log files
would thus contain ASR performance data. As recognized
by the present invention, log files produced by ASR
products are difficult to analyze because of their
content and form. The present invention addresses that
difficulty.

BRIEF DESCRIPTION OF THE DRAWINGS

A more complete understanding of the present 
invention and advantages thereof may be acquired by 
referring to the appended claims, the following 
description of one or more example embodiments, and the
accompanying drawings, in which: 

FIGURE 1 depicts a block diagram of an example 
embodiment of a system for analyzing automatic speech 
recognition (ASR) performance data, according to the 
present disclosure;

FIGURE 2 depicts a table that portrays example 
dialog involving an example ASR system; 

FIGURE 3 depicts a table that portrays an example 
log file to be processed by the ASR performance data 
analysis engine of FIGURE 1;

FIGURE 4 depicts an example embodiment of a user 
interface produced by the ASR performance data analysis 
engine of FIGURE 1; 

FIGURE 5 depicts a flowchart of an example 
embodiment of a process for analyzing ASR performance
data, according to the present disclosure; 

FIGURE 6 depicts a table of example events, 
indicator strings, and automated processes according to 
the present disclosure; and

FIGURE 7 depicts an example embodiment of
interpretation results generated by the ASR performance 
data analysis engine of FIGURE 1.

DETAILED DESCRIPTION OF EXAMPLE EMBODIMENT(S)

The log files produced by ASR products are
difficult to analyze. Consequently, even though
usability studies may be performed in which customers
interact with different ASR systems by telephone, it is
difficult to evaluate the performance of an ASR system
and to compare the performance of different ASR systems.

This document describes example embodiments of a
system, a method, and a program product for analyzing ASR
performance data. Advantages of various embodiments of
the present invention may include making it easier to
evaluate the performance of an individual ASR system and
easier to compare the performance of different systems.

FIGURE 1 depicts a block diagram of an example
embodiment of a system for analyzing ASR performance
data, according to the present disclosure. As
illustrated, a data processing system 10 may include one
or more central processing units (CPUs) 20 that
communicate with input/output (I/O) components, data
storage components, and other components via one or more
system buses 50. The I/O components in data processing
system 10 may include one or more communications
adapters, such as network adapter 32, for communicating
with remote systems or components, such as external data
storage 36, via a network 34. The I/O components may
further include one or more I/O modules 30 in
communication with I/O devices such as a display 58, a
keyboard 52, and a mouse 54. The data storage components
may include various volatile and non-volatile data
storage devices. For instance, data processing system 10
may include one or more volatile data storage devices,
such as random access memory (RAM) 24, and non-volatile
internal data storage 22, such as one or more hard disk
drives. Thus, data processing system 10 may use internal
data storage 22, external data storage 36, or a
combination of internal and external data storage
devices.

According to the illustrated embodiment, data
processing system 10 may be used to host an ASR product,
and also to analyze the performance of that ASR product.
However, in alternative embodiments, separate data
processing systems may be used for those two functions.

During a study or trial, a human participant may
interact with the ASR product or system by telephone.
Each call can be characterized as a series of one or more
dialogs or exchanges between the ASR system and the
participant. The participant's utterances may include
responses to verbal prompts, such as questions or
instructions, generated by the ASR system. In general, a
system-generated prompt and the corresponding response,
if any, may be referred to collectively as an "individual
exchange." Verbal interactions between the ASR system
and a participant may be referred to in general as
"dialog."

FIGURE 2 depicts a table that portrays example
dialog involving an example ASR system. For instance, in
a dialog, the ASR system may play a recording to prompt
the participant. The participant may then respond with
an utterance. The ASR system may then attempt to
interpret the response using ASR. Based on the
interpretation, or the failure to interpret, the ASR
system may repeat the prompt, play a new prompt, end the
call, or take some other action. Referring again to
FIGURE 1, the ASR system may include an ASR engine 40
that interprets the participant's utterances and produces
a log file 42 detailing each event that occurs during the
trial.

FIGURE 3 depicts a table that portrays an example
log file 42 from ASR engine 40. As illustrated, log file
42 may include many different kinds of information. Much
of the information may not be necessary for the purposes
of a particular study. Log file 42 may also be formatted
in a way that makes it difficult for a person to locate
the information necessary for a particular study. The
formatting may also make it difficult to decipher or
understand that information. In addition, the log file
may not directly contain all of the data of interest to a
reviewer. For instance, it may be necessary to compute
certain types of data, such as the duration of certain
events, based on information in log file 42 such as start
times and end times.

Furthermore, multiple trials may be run to generate
performance data from multiple ASR systems, and multiple
participants may interact with each of those ASR systems.
Consequently, the reviewer may be faced with the task of
analyzing and comparing a large number of log files 42.

As illustrated in FIGURE 1, data processing system
10 includes an analysis engine 44 for automatically
processing ASR performance data, such as the data in log
file 42. Analysis engine 44 may also be referred to as
an ASR log analysis engine 44.

Data processing system 10 may also include a
call-type file 43 and a substitutions file 45. As described
in greater detail below, analysis engine 44 may use
call-type file 43 and substitutions file 45 to process log
file 42. For instance, call-type file 43 may contain
event definitions that help analysis engine 44 interpret
log file data, and substitutions file 45 may contain
predefined replacement definitions to be applied to log
files.

In the illustrated embodiment, programs or
applications such as analysis engine 44 may be copied
from internal data storage 22 into RAM 24 for execution.
Likewise, data to be processed, such as call-type file
43, substitutions file 45, log file 42, or parts thereof,
may be copied to RAM 24 for processing. Programs or data
may also be retrieved by data processing system 10 from
external data storage 36.

In operation, analysis engine 44 may generate one or
more user interface screens to allow the user to set
various parameters and execute various functions. For
instance, FIGURE 4 depicts an example embodiment of a
user interface screen 60 produced by analysis engine 44.
Screen 60 may also be referred to as a control panel 60.

As illustrated, control panel 60 allows the user to
select and open input files and output files. In
addition, control panel 60 allows the user to select and
open call-type files and substitutions files. By
selecting a file, the user may specify the file to be
processed or generated by analysis engine 44. Opening a
file may cause the file to be opened in a new window,
possibly in a new application, such as a spreadsheet or
word processing application.

Multiple substitutions files may be predefined, with
each including data that analysis engine 44 may use to
replace specified strings with specified replacement
strings, during the process of analyzing or interpreting
a selected log file. The different substitutions files
may be used for processing log files from different ASR
systems. Similarly, multiple call-type files may be
predefined, with each including data that analysis engine
44 may use to interpret a selected log file. The
different call-type files may be used to interpret log
files from different ASR systems. Different call-type
files may associate different event indicator strings
with the same event or event identifier.

Control panel 60 may also allow the user to save the
current settings, to specify whether column headers
should be sent to the output file, and to initiate
processing of the specified log file. A progress
indicator may also be provided.

FIGURE 5 depicts a flowchart of an example
embodiment of a process for analyzing ASR performance
data, according to the present disclosure. The
illustrated process is described with reference to
operations performed by data processing system 10. The
process starts with a user having utilized control panel
60 to select log file 42 as the input file, output file
46 as the output file, call-type file 43 as the call-type
file, and substitutions file 45 as the substitutions
file.

At block 100, analysis engine 44 receives a command
to start processing, for instance in response to a user
clicking on process button 62 in FIGURE 4. In response,
analysis engine 44 may copy log file 42 into RAM 24, and
analysis engine 44 may automatically scan through that
copy of log file 42, performing string substitutions, in
accordance with the predefined replacement definitions in
substitutions file 45, as shown at block 102. For
example, analysis engine 44 may change "prompt
audio/70.vox" to "Greeting," and analysis engine 44 may
change "prompt audio/11.vox" to "Opening prompt."
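
By way of illustration only, the substitution pass at block
102 might be sketched as follows. The function name, the
in-memory list of replacement pairs, and the sample log lines
(including their timestamp layout) are assumptions made for
this sketch; the disclosure does not specify the format of
substitutions file 45 or of log file 42.

```python
def apply_substitutions(lines, substitutions):
    """Return new log lines with each listed string replaced.

    `substitutions` is a list of (original, replacement) pairs,
    applied in order to every line.
    """
    result = []
    for line in lines:
        for original, replacement in substitutions:
            line = line.replace(original, replacement)
        result.append(line)
    return result


# Replacement definitions mirroring the two examples in the text.
substitutions = [
    ("prompt audio/70.vox", "Greeting"),
    ("prompt audio/11.vox", "Opening prompt"),
]

# Hypothetical log lines; the real log layout is not specified.
log_lines = [
    "10:15:02.120 prompt audio/70.vox",
    "10:15:06.480 prompt audio/11.vox",
]

modified = apply_substitutions(log_lines, substitutions)
```

Applying the pairs in a fixed order keeps the result
predictable if one replacement string happens to contain
another original string.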

Analysis engine 44 may then begin an iterative
process of automatically extracting and translating data
from the modified log file 42 in RAM 24, based on
call-type file 43. For instance, as depicted at block 104,
analysis engine 44 may load call-type file 43 into RAM
24. As shown at block 106, analysis engine 44 may begin
stepping through each line in log file 42 to find a
relevant event.

For example, call-type file 43 may contain a number
of event definitions. An event definition may include a
string that is known to correspond to a certain type of
event in log files such as log file 42. Such strings may
be called event indicator strings. Analysis engine 44
may search for those strings when processing each line in
log file 42. An event definition may also include a
standard identifier for a particular type of event or
data, linking that type of event or data to a specific
event indicator string. Analysis engine 44 may disregard
any event in log file 42 that is not specified in
call-type file 43.
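
As a non-limiting sketch, event definitions and the
line-by-line search might be represented as shown below. The
class name, the standard identifiers, the substring-matching
rule, and the sample log lines are all hypothetical; the
disclosure does not prescribe a data structure or a file
format for call-type file 43.

```python
from typing import NamedTuple


class EventDefinition(NamedTuple):
    indicator: str  # event indicator string searched for in each line
    event_id: str   # standard identifier linked to that string


# Hypothetical definitions, as might be loaded from a call-type file.
CALL_TYPE = [
    EventDefinition("call_start", "CALL_START"),
    EventDefinition("log: Subject ID=", "SUBJECT_ID"),
    EventDefinition("Greeting", "PROMPT_START"),
]


def find_event(line, definitions):
    """Return the first definition whose indicator occurs in the line."""
    for definition in definitions:
        if definition.indicator in line:
            return definition
    return None


def scan(lines, definitions):
    """Yield (line number, event id, line) for each recognized event.

    Lines that match no definition are disregarded.
    """
    for number, line in enumerate(lines, start=1):
        definition = find_event(line, definitions)
        if definition is not None:
            yield number, definition.event_id, line


sample = [
    "09:00:00.000 call_start",
    "09:00:00.010 vendor_debug buffer=512",
    "09:00:00.020 log: Subject ID=Ginny",
]
events = list(scan(sample, CALL_TYPE))
```

Because each ASR product may log the same event with a
different string, swapping in a different list of definitions
leaves the scanning code unchanged.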

As shown at block 110, analysis engine 44 may
determine whether the end of log file 42 was reached. If
the end of the log file was not reached, analysis engine
44 may process the event that was found, as shown at
block 112. For example, as events are found, analysis
engine 44 may store various values pertaining to the
performance of the ASR system that produced log file 42,
and may compute various relevant performance metrics.
Consequently, as described in greater detail below,
analysis engine 44 may find and interpret specific,
predefined call events or characteristics in log file 42,
based on the event definitions. Analysis engine 44 may
also extract relevant values from log file 42, based on
the event indicator strings defined in call-type file 43.
For example, analysis engine 44 may extract the start
time and date for the call, as well as a participant or
customer identifier ("ID").

FIGURE 6 depicts a table of some example events,
indicator strings, and automated processes according to
the present disclosure. When processing the first line
of log file 42, for example, analysis engine 44 may
recognize the event indicator string "call_start," and,
in response, record the call start time from the
timestamp in that line as a characteristic for a
standardized "call start" event. Those operations are
represented by row 1 in FIGURE 6.

As indicated in row 2, analysis engine 44 may then
recognize the event indicator string "log: Subject ID=" in
the third line of log file 42, and, in response, record
"Ginny" as the pertinent characteristic. As depicted in
row 3, analysis engine 44 may then recognize the event
indicator string "prompt audio/70.vox" in the fifth line
of log file 42, and, in response, extract the start time
for that prompt from the timestamp in that line.
Alternatively, the fifth line of log file 42 may include
the event indicator string "Greeting" instead of "prompt
audio/70.vox," pursuant to the substitution process
described above, and analysis engine 44 may recognize and
process "Greeting" as the event indicator string.

Although FIGURE 6 illustrates examples of various
types of events that may be recognized and processed by
analysis engine 44, many additional events, such as
"prompt end" events, "input end" events, and other events
or values, may also be recognized and processed by
analysis engine 44, in accordance with the approach
described herein. For instance, analysis engine 44 may
recognize the event indicator string "prompt__end done:"
in line six of log file 42, and, in response, may extract
the end time from the timestamp in that line. Analysis
engine 44 may then compute and record the prompt duration
for the greeting, based on the pertinent start time and
end time.
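
For illustration, the prompt duration computation might be
sketched as follows. The timestamp layout (a leading
"HH:MM:SS.fff" field followed by a space) and the sample lines
are assumptions; the actual layout of log file 42 is not
given in this description.

```python
from datetime import datetime

# Assumed timestamp layout; a real log may use a different one.
TIME_FORMAT = "%H:%M:%S.%f"


def parse_timestamp(line):
    """Parse the leading timestamp field of a log line."""
    return datetime.strptime(line.split(" ", 1)[0], TIME_FORMAT)


def prompt_duration(start_line, end_line):
    """Return the seconds elapsed between two timestamped log lines."""
    elapsed = parse_timestamp(end_line) - parse_timestamp(start_line)
    return elapsed.total_seconds()


# Hypothetical lines for the greeting prompt's start and end events.
start_line = "10:15:02.120 Greeting"
end_line = "10:15:06.480 prompt__end done:"

duration = prompt_duration(start_line, end_line)  # 4.36 seconds
```

This sketch uses time-of-day values only, so a call that
spans midnight would need the date to be parsed as well.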

Further examples of the performance metrics or
characteristics that may be extracted or computed by
analysis engine 44 may include, without limitation, the
following:

• the duration of each individual exchange;

• the duration of each prompt;

• the duration of the caller's utterance;

• the latency or time interval between the end of the
prompt and the beginning of the caller's utterance;

• the results of the attempts by the ASR system to
recognize the caller's utterance; and

• the duration of the entire call.

Additional event definitions may be predefined, with
event indicator strings for events relevant to any
particular project.
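
As one possible sketch, the timing metrics listed above might
be derived from four times recorded for each individual
exchange. The class and field names are hypothetical, times
are assumed to be seconds measured from the start of the call,
and treating the exchange as running from prompt start to
utterance end is an assumption of this sketch rather than a
definition taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Exchange:
    prompt_start: float  # seconds from call start
    prompt_end: float
    input_start: float   # beginning of the caller's utterance
    input_end: float     # end of the caller's utterance

    @property
    def prompt_duration(self):
        return self.prompt_end - self.prompt_start

    @property
    def utterance_duration(self):
        return self.input_end - self.input_start

    @property
    def latency(self):
        # Interval between the end of the prompt and the start of the
        # utterance. With this subtraction the value is negative when
        # the utterance starts before the recorded prompt end.
        return self.input_start - self.prompt_end

    @property
    def exchange_duration(self):
        # Assumed definition: prompt start through utterance end.
        return self.input_end - self.prompt_start


exchange = Exchange(prompt_start=0.0, prompt_end=4.5,
                    input_start=5.25, input_end=7.0)
```

For the sample values, the prompt lasts 4.5 seconds, the
latency is 0.75 seconds, the utterance lasts 1.75 seconds, and
the whole exchange lasts 7.0 seconds.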

The data values and performance metrics that are
recognized or generated during the process of finding and
processing events may be referred to in general as
interpretation results. According to the example
embodiment, some or all of the interpretation results may
ultimately be saved in output file 46, in internal data
storage 22 or in external data storage 36, displayed on
display device 58, and/or printed.

FIGURE 7 depicts example interpretation results that
may be generated by analysis engine 44 and saved in
output file 46. The interpretation results may be
categorized as general characteristics and exchange
characteristics. The general characteristics and the
exchange characteristics may also be referred to as
general analysis results and exchange analysis results,
respectively. The general characteristics may include
values that pertain to an entire call, such as the ID of
the participant, the duration of the call, etc. In
FIGURE 7, general characteristics and labels for those
values are presented in the first two rows. According to
the example embodiment, exchange characteristics may
include values pertaining to individual exchanges within
a call. The labels and values starting at row four in
FIGURE 7 may represent exchange characteristics.

For example, the durations of individual exchanges
are depicted under the heading "DLGDUR." The durations of
prompts are depicted under the heading "PRMTDUR." The
results of attempts by the ASR system to recognize speech
are depicted under the heading "RECRSLT." Identifiers or
names for the different messages played by the ASR system
are listed under the heading "PROMPTNAME." In addition,
a dialog may include a group of prompts. Thus, there may
be several prompt names within a given dialog name listed
under the heading "DLGNAME." Analysis engine 44 may
extract the prompt names, the dialog names, and other
data from log file 42 after some or all of those names
have been provided pursuant to the substitution process
described above. Data that indicates whether the prompt
played to completion or whether, instead, the prompt
ended early may be depicted under the heading
"PRMTENDTYPE." For instance, if a caller barges in with
a response while a prompt is still playing, the ASR
system may terminate the prompt as soon as it detects the
speech.

Referring again to FIGURE 5, after analysis engine
44 processes an event, analysis engine 44 may search log
file 42 for the next event, as indicated by the arrow
returning to block 106 from block 112. As indicated at
block 120, after all of the lines in log file 42 have
been analyzed, analysis engine 44 may compute the general
analysis results, such as the duration of the entire call
listed under the heading "CALLDUR," and the total amount
of time spent playing prompts. Other general analysis
results may include, without limitation, the following:

• an identifier for the participant or subject under
"SUBID,"

• an identifier for a particular call to distinguish
multiple calls from the same subject under "CALLNUM,"

• identifiers for different call designs being tested
under "PATH,"

• the date and time the call started,

• the duration of time spent in dialog between the caller
and the ASR system under "ASRDUR,"

• the elapsed time between termination of the ASR dialog
and pick up by the operator under "GETOPDUR,"

• the time duration for "storing and forwarding" the
information received from the subject to the operator
under "SAFDUR," and

• the time duration spent with the subject connected to
the live operator under "LIVEDUR."

As illustrated at block 122, analysis engine 44 may
then save the interpretation results to output file 46.
The automated interpretation process may then end. A
user may then open output file 46, for instance by
selecting the "Open Output File" button on control panel
60.

By using the approach described above, analysis
engine 44 may generate output file 46 with format and
content that may be understood by a person with relative
ease, compared to log file 42. Output file 46 may omit
unnecessary information, and may include results that
were computed by analysis engine 44, possibly without
including the data values used in such computations. In
one embodiment, the interpretation results may reproduce
less than half of the data from the log file. In
alternative embodiments, the interpretation results may
reproduce less than seventy-five percent of the data from
the log file.
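
By way of example only, the save operation at block 122 might
be sketched as follows. The comma-separated layout, the
function name, the choice and order of columns, and the sample
values are assumptions of this sketch; only the heading labels
themselves and the arrangement of general results in the first
two rows with exchange results starting at row four are taken
from the description of FIGURE 7.

```python
import csv
import io

GENERAL_HEADERS = ["SUBID", "CALLNUM", "CALLDUR"]
EXCHANGE_HEADERS = ["DLGNAME", "PROMPTNAME", "PRMTDUR", "DLGDUR", "RECRSLT"]


def write_results(stream, general, exchanges, include_headers=True):
    """Write general results, a blank row, then one row per exchange.

    `include_headers` models the control panel option that determines
    whether column headers are sent to the output file.
    """
    writer = csv.writer(stream, lineterminator="\n")
    if include_headers:
        writer.writerow(GENERAL_HEADERS)
    writer.writerow([general[name] for name in GENERAL_HEADERS])
    writer.writerow([])  # blank separator row
    if include_headers:
        writer.writerow(EXCHANGE_HEADERS)
    for exchange in exchanges:
        writer.writerow([exchange[name] for name in EXCHANGE_HEADERS])


# Hypothetical interpretation results for a single call.
general = {"SUBID": "Ginny", "CALLNUM": 1, "CALLDUR": 42.5}
exchanges = [
    {"DLGNAME": "Opening", "PROMPTNAME": "Greeting",
     "PRMTDUR": 4.36, "DLGDUR": 7.0, "RECRSLT": "ok"},
]

buffer = io.StringIO()  # stands in for output file 46
write_results(buffer, general, exchanges)
output_text = buffer.getvalue()
```

Writing only the computed values, rather than the raw
timestamps they were derived from, is what keeps such an
output smaller and easier to read than the source log.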

Although the present invention has been described
with reference to various example embodiments, those with
ordinary skill in the art will understand that numerous
variations of those embodiments could be practiced
without departing from the scope and spirit of the
present invention. For example, one of ordinary skill
will appreciate that alternative embodiments could be
deployed with many variations in the number and type of
components in the system, the network protocols, the
system or network topology, the distribution of various
software and data components among the data processing
systems in the network, and myriad other details (e.g.,
the length of various fields or columns, the number of
columns, and other characteristics of the output)
without departing from the present invention.

It should also be noted that the hardware and
software components depicted in the example embodiment
represent functional elements that are reasonably
self-contained so that each can be designed, constructed, or
updated substantially independently of the others. In
alternative embodiments, however, it should be understood
that the components may be implemented as hardware,
software, or combinations of hardware and software for
providing the functionality described and illustrated
herein. In alternative embodiments, information handling
systems incorporating the invention may include personal
computers, mini computers, mainframe computers,
distributed computing systems, and other suitable
devices.

In alternative embodiments, the trial of the ASR
system may be performed by one data processing system,
and the performance analysis may be performed by a
different data processing system, with reference to the
results from the trial. Similarly, one or more of the
components illustrated as residing in internal data
storage may instead reside in external data storage.

Alternative embodiments of the invention also
include computer-usable media encoding logic such as
computer instructions for performing the operations of
the invention. Such computer-usable media may include,
without limitation, storage media such as floppy disks,
hard disks, CD-ROMs, read-only memory, and random access
memory; as well as communications media such as wires,
optical fibers, microwaves, radio waves, and other
electromagnetic or optical carriers. The control logic
may also be referred to as a program product.

Many other aspects of the example embodiment may
also be changed in alternative embodiments without
departing from the scope and spirit of the invention.
The scope of the invention is therefore not limited to
the particulars of the embodiments or implementations
illustrated herein, but is defined by the appended
claims.